4 research outputs found
Embedded dynamic programming networks for networks-on-chip
PhD ThesisRelentless technology downscaling and recent technological advancements
in three dimensional integrated circuit (3D-IC) provide a promising
prospect to realize heterogeneous system-on-chip (SoC) and homogeneous
chip multiprocessor (CMP) based on the networks-onchip
(NoCs) paradigm with augmented scalability, modularity and
performance. In many cases in such systems, scheduling and managing
communication resources are the major design and implementation
challenges instead of the computing resources. Past research
efforts were mainly focused on complex design-time or simple heuristic
run-time approaches to deal with the on-chip network resource
management with only local or partial information about the network.
This could yield poor communication resource utilizations and amortize
the benefits of the emerging technologies and design methods.
Thus, the provision for efficient run-time resource management in
large-scale on-chip systems becomes critical. This thesis proposes a
design methodology for a novel run-time resource management infrastructure
that can be realized efficiently using a distributed architecture,
which closely couples with the distributed NoC infrastructure. The
proposed infrastructure exploits the global information and status
of the network to optimize and manage the on-chip communication
resources at run-time.
There are four major contributions in this thesis. First, it presents a
novel deadlock detection method that utilizes run-time transitive closure
(TC) computation to discover the existence of deadlock-equivalence
sets, which imply loops of requests in NoCs. This detection scheme,
TC-network, guarantees the discovery of all true-deadlocks without
false alarms in contrast to state-of-the-art approximation and heuristic
approaches. Second, it investigates the advantages of implementing
future on-chip systems using three dimensional (3D) integration and
presents the design, fabrication and testing results of a TC-network
implemented in a fully stacked three-layer 3D architecture using a
through-silicon via (TSV) complementary metal-oxide semiconductor
(CMOS) technology. Testing results demonstrate the effectiveness
of such a TC-network for deadlock detection with minimal computational
delay in a large-scale network. Third, it introduces an adaptive
strategy to effectively diffuse heat throughout the three dimensional
network-on-chip (3D-NoC) geometry. This strategy employs a dynamic
programming technique to select and optimize the direction of data
manoeuvre in NoC. It leads to a tool, which is based on the accurate
HotSpot thermal model and SystemC cycle accurate model, to simulate
the thermal system and evaluate the proposed approach. Fourth, it
presents a new dynamic programming-based run-time thermal management
(DPRTM) system, including reactive and proactive schemes, to
effectively diffuse heat throughout NoC-based CMPs by routing packets
through the coolest paths, when the temperature does not exceed
chip’s thermal limit. When the thermal limit is exceeded, throttling is
employed to mitigate heat in the chip and DPRTM changes its course
to avoid throttled paths and to minimize the impact of throttling on
chip performance.
This thesis enables a new avenue to explore a novel run-time resource
management infrastructure for NoCs, in which new methodologies
and concepts are proposed to enhance the on-chip networks for
future large-scale 3D integration.Iraqi Ministry of Higher Education and Scientific Research (MOHESR)
Thermal optimization in network-on-chip-based 3D chip multiprocessors using dynamic programming networks
The substantial silicon density in 3D VLSI, albeit its numerous advantages, introduces serious thermal threats that would lead to faults and system failures. This article introduces a new strategy to effectively diffuse heat from NoC-based 3D CMPs. Runtime Dynamic Programming Network (DPN) is proposed to optimize routing directions and provide silicon temperature moderation. Both on-chip reliability and computational performance have been improved by 63% and 27%, respectively, with the DPN approach. This work enables a new avenue to explore the adaptability for future large-scale 3D integration
A scalable turbo decoding algorithm for high-throughput network-on-chip implementation
Wireless communication at near-capacity transmission throughputs is facilitated by employing sophisticated Error Correction Codes (ECCs), such as turbo codes. However, real-time communication at high transmission throughputs is only possible if the challenge of implementing turbo decoders having equally high processing throughputs can be overcome. Furthermore, in many applications, turbo decoders are required to have the flexibility of supporting a wide variety of turbo code parametrizations. This motivates the implementation of turbo decoders using Networks-on-Chip (NoCs), which facilitate flexible and high-throughput parallel processing. However, turbo decoders conventionally operate on the basis of the Logarithmic Bahl-Cocke-Jelinek-Raviv (Log-BCJR) algorithm, which has an inherently-serial nature, owing to its data dependencies. This limits the exploitation of the NoC’s computing resources, particularly as the size of the NoC is scaled up. Motivated by this, we propose a novel turbo decoder algorithm, which eliminates the data dependencies of the Log-BCJR algorithm and therefore has an inherently-parallel nature. We show that by jointly optimizing the proposed algorithm with the NoC architecture, a significantly improved utility of the available computing resources is achieved. Owing to this, our proposed turbo decoder achieves a factor of up to 2.13 higher processing throughput than a Log-BCJR benchmarker